Edinburg
Bitcoin Price Forecasting Based on Hybrid Variational Mode Decomposition and Long Short Term Memory Network
This study proposes a hybrid deep learning model for forecasting the price of Bitcoin, as the digital currency is known to exhibit frequent fluctuations. The model combines Variational Mode Decomposition (VMD) with a Long Short-Term Memory (LSTM) network. First, VMD is used to decompose the original Bitcoin price series into Intrinsic Mode Functions (IMFs). Each IMF is then modeled using an LSTM network to capture temporal patterns more effectively. The individual forecasts from the IMFs are aggregated to produce the final prediction of the original Bitcoin price series. To determine the predictive power of the proposed hybrid model, a comparative analysis was conducted against the standard LSTM. The results confirmed that the hybrid VMD+LSTM model outperforms the standard LSTM across all the evaluation metrics (RMSE, MAE, and R2) and also provides a reliable 30-day forecast.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.06)
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- Europe > Czechia > Prague (0.04)
- (2 more...)
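The decompose, forecast, and aggregate pipeline in the Bitcoin entry above ends with two simple steps that can be sketched without any deep learning library: recombining per-IMF forecasts and scoring the result. The sketch below assumes the aggregation is an element-wise sum of the component forecasts (the abstract only says "aggregated"), and the function names are illustrative, not taken from the paper.

```python
import math


def aggregate(component_forecasts):
    """Element-wise sum of per-component forecasts (assumed aggregation rule)."""
    return [sum(values) for values in zip(*component_forecasts)]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is the sense in which one model "outperforms" another on these metrics.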
An Unsupervised Time Series Anomaly Detection Approach for Efficient Online Process Monitoring of Additive Manufacturing
Cantu, Frida, Ibarra, Salomon, Gonzales, Arturo, Barreda, Jesus, Liu, Chenang, Zhang, Li
Online sensing plays an important role in advancing modern manufacturing. The real-time sensor signals, which can be stored as high-resolution time series data, contain rich information about the operation status. One of their popular uses is online process monitoring, which can be achieved by effective anomaly detection from the sensor signals. However, most existing approaches either rely heavily on labeled data for training supervised models or are designed to detect only extreme outliers, and are therefore ineffective at identifying subtle semantic off-track anomalies that mark where new regimes or unexpected routines start. To address this challenge, we propose a matrix profile-based unsupervised anomaly detection algorithm that captures fabrication cycle similarity and performs semantic segmentation to precisely identify the onset of defect anomalies in additive manufacturing. The effectiveness of the proposed method is demonstrated by experiments on real-world sensor data.
- North America > United States > Oklahoma > Payne County > Stillwater (0.14)
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- Machinery > Industrial Machinery (0.88)
- Information Technology (0.69)
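To make the matrix profile idea in the entry above concrete, here is a brute-force sketch: for every length-m subsequence it records the z-normalized Euclidean distance to its nearest non-overlapping neighbor, so cycles that repeat score near zero and an unusual cycle scores high. This is only the basic building block under assumed choices (an exclusion zone of m // 2, quadratic-time search); it is not the authors' segmentation algorithm.

```python
import math


def znorm(window):
    """Z-normalize a window; constant windows map to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Distance from each length-m subsequence to its nearest neighbor."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = max(1, m // 2)  # skip trivial matches with overlapping windows
    profile = []
    for i, a in enumerate(windows):
        best = math.inf
        for j, b in enumerate(windows):
            if abs(i - j) < exclusion:
                continue
            dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            best = min(best, dist)
        profile.append(best)
    return profile


# Periodic signal with a single corrupted sample at index 13.
signal = [0, 1, 0, -1] * 6
signal[13] = 5
profile = matrix_profile(signal, m=4)
discord = max(range(len(profile)), key=profile.__getitem__)
```

The index with the largest profile value (the discord) falls among the windows that cover the corrupted sample, while windows from clean cycles have an exact match elsewhere in the series.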
Adaptive von Mises-Fisher Likelihood Loss for Supervised Deep Time Series Hashing
Perez, Juan Manuel, Garcia, Kevin, Berry, Brooklyn, Song, Dongjin, Gao, Yifeng
Indexing time series by creating compact binary representations is a fundamental task in time series data mining. Recently, deep learning-based hashing methods have proven effective for indexing time series based on semantic meaning rather than just raw similarity. The purpose of deep hashing is to map samples with the same semantic meaning to identical binary hash codes, enabling more efficient search and retrieval. Unlike other supervised representation learning methods, supervised deep hashing requires a discretization step to convert real-valued representations into binary codes, but this can induce significant information loss. In this paper, we propose a von Mises-Fisher (vMF) hashing loss. The proposed deep hashing model maps data to an M-dimensional hyperspherical space to effectively reduce information loss and models each data class as points following distinct vMF distributions. The designed loss aims to maximize the separation between each modeled vMF distribution to provide a better way to maximize the margin between each semantically different data sample. Experimental results show that our method outperforms existing baselines. The implementation is publicly available at https://github.com/jmpq97/vmf-hashing
- North America > United States > Connecticut > Tolland County > Storrs (0.14)
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- North America > Canada (0.04)
- Europe > Greece (0.04)
- Health & Medicine (0.47)
- Information Technology (0.46)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
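For reference, the standard von Mises-Fisher density that the hashing entry above builds on can be written as follows. The symbols (unit vector x, mean direction μ, concentration κ) are the usual textbook notation and are assumed here; the abstract does not spell out the paper's exact loss, so this is background rather than the authors' formulation.

```latex
% vMF density on the unit hypersphere S^{M-1} \subset \mathbb{R}^M
f(\mathbf{x};\, \boldsymbol{\mu}, \kappa)
  = C_M(\kappa)\, \exp\!\left(\kappa\, \boldsymbol{\mu}^{\top}\mathbf{x}\right),
\qquad
C_M(\kappa)
  = \frac{\kappa^{M/2-1}}{(2\pi)^{M/2}\, I_{M/2-1}(\kappa)},
\qquad
\|\mathbf{x}\| = \|\boldsymbol{\mu}\| = 1,\ \kappa \ge 0
```

Here I_v denotes the modified Bessel function of the first kind of order v. A larger κ concentrates the mass around μ, which is why separating class-wise vMF distributions corresponds to separating directions on the hypersphere.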
Enhancing Credit Default Prediction Using Boruta Feature Selection and DBSCAN Algorithm with Different Resampling Techniques
Ampomah, Obu-Amoah, Agyemang, Edmund, Acheampong, Kofi, Agyekum, Louis
This study examines credit default prediction by comparing three techniques, namely SMOTE, SMOTE-Tomek, and ADASYN, that are commonly used to address the class imbalance problem in credit default situations. Recognizing that credit default datasets are typically skewed, with defaulters comprising a much smaller proportion than non-defaulters, we began our analysis by evaluating machine learning (ML) models on the imbalanced data without any resampling to establish baseline performance. These baseline results provide a reference point for understanding the impact of subsequent balancing methods. In addition to traditional classifiers such as Naive Bayes and K-Nearest Neighbors (KNN), our study also explores the suitability of advanced ensemble boosting algorithms, including Extreme Gradient Boosting (XGBoost), AdaBoost, Gradient Boosting Machines (GBM), and LightGBM, for credit default prediction using Boruta feature selection and DBSCAN-based outlier detection, both before and after resampling. A real-world credit default dataset sourced from the University of Cleveland ML Repository was used to build ML classifiers, and their performances were tested. The criteria chosen to measure model performance are the area under the receiver operating characteristic curve (ROC-AUC), area under the precision-recall curve (PR-AUC), G-mean, and F1-score. The results from this empirical study indicate that the Boruta+DBSCAN+SMOTE-Tomek+GBM classifier outperformed the other ML models (F1-score: 82.56%, G-mean: 82.98%, ROC-AUC: 90.90%, PR-AUC: 91.85%) in a credit default context. The findings establish a foundation for future progress in creating more resilient and adaptive credit default systems, which will be essential as credit-based transactions continue to rise worldwide.
- North America > United States > Michigan > Kalamazoo County > Kalamazoo (0.04)
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
- Banking & Finance > Credit (0.70)
- Information Technology (0.68)
- Banking & Finance > Loans (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
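Two of the metrics named in the credit default entry above, F1-score and G-mean, are the ones most sensitive to class imbalance, and both follow directly from confusion-matrix counts. The sketch below assumes a binary task with label 1 as the positive (default) class; the helper names are illustrative.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity (recall) and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)
```

Because G-mean multiplies the two per-class recall rates, a classifier that ignores the minority class scores zero regardless of its overall accuracy, which is why it is commonly reported alongside resampling experiments.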
Prediction of Coffee Ratings Based On Influential Attributes Using SelectKBest and Optimal Hyperparameters
Agyemang, Edmund, Agbota, Lawrence, Agbenyeavu, Vincent, Akabuah, Peggy, Bimpong, Bismark, Attafuah, Christopher
This study explores the application of supervised machine learning algorithms to predict coffee ratings based on a combination of influential textual and numerical attributes extracted from user reviews. Through careful data preprocessing, including text cleaning, feature extraction using TF-IDF, and selection with SelectKBest, the study identifies key factors contributing to coffee quality assessments. Six models (Decision Tree, K-Nearest Neighbors, Multi-layer Perceptron, Random Forest, Extra Trees, and XGBoost) were trained and evaluated using optimized hyperparameters. Model performance was assessed primarily using F1-score, G-mean, and AUC. Results demonstrate that ensemble methods (Extra Trees, Random Forest, and XGBoost), as well as the Multi-layer Perceptron, consistently outperform simpler classifiers (Decision Tree and K-Nearest Neighbors) on these metrics. The findings highlight the importance of rigorous feature selection and hyperparameter tuning in building robust predictive systems for sensory product evaluation, offering a data-driven approach to complement traditional coffee cupping by trained professionals.
- South America > Brazil (0.04)
- Africa > Ethiopia (0.04)
- South America > Colombia (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.97)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
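The TF-IDF feature extraction mentioned in the coffee ratings entry above can be illustrated with the textbook weighting tf × log(N / df). This is a minimal sketch under that assumption, with whitespace tokenization and made-up review snippets; library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter


def tfidf(documents):
    """Per-document term weights using tf * log(N / df) (unsmoothed)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical review snippets, for illustration only.
reviews = ["bright citrus acidity", "bright chocolate finish"]
weights = tfidf(reviews)
```

Terms that occur in every review (here "bright") get weight zero, while terms specific to one review keep a positive weight; a selector such as SelectKBest would then rank these features against the rating labels.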
Benchmarking Foundation Speech and Language Models for Alzheimer's Disease and Related Dementia Detection from Spontaneous Speech
Li, Jingyu, Mao, Lingchao, Wang, Hairong, Wang, Zhendong, Mao, Xi, Ni, Xuelei Sherry
Background: Alzheimer's disease and related dementias (ADRD) are progressive neurodegenerative conditions where early detection is vital for timely intervention and care. Spontaneous speech contains rich acoustic and linguistic markers that may serve as non-invasive biomarkers for cognitive decline. Foundation models, pre-trained on large-scale audio or text data, produce high-dimensional embeddings encoding contextual and acoustic features. Methods: We used the PREPARE Challenge dataset, which includes audio recordings from over 1,600 participants with three cognitive statuses: healthy control (HC), mild cognitive impairment (MCI), and Alzheimer's Disease (AD). We excluded non-English, non-spontaneous, or poor-quality recordings. The final dataset included 703 (59.13%) HC, 81 (6.81%) MCI, and 405 (34.06%) AD cases. We benchmarked a range of open-source foundation speech and language models to classify cognitive status into the three categories. Results: The Whisper-medium model achieved the highest performance among speech models (accuracy = 0.731, AUC = 0.802). Among language models, BERT with pause annotation performed best (accuracy = 0.662, AUC = 0.744). ADRD detection using state-of-the-art automatic speech recognition (ASR) model-generated audio embeddings outperformed others. Including non-semantic features like pause patterns consistently improved text-based classification. Conclusion: This study introduces a benchmarking framework using foundation models and a clinically relevant dataset. Acoustic-based approaches -- particularly ASR-derived embeddings -- demonstrate strong potential for scalable, non-invasive, and cost-effective early detection of ADRD.
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- North America > United States > Texas > Cameron County > Brownsville (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering
Chen, Jialin, Feng, Aosong, Zhao, Ziyu, Garza, Juan, Nurbek, Gaukhar, Qin, Cheng, Maatouk, Ali, Tassiulas, Leandros, Gao, Yifeng, Ying, Rex
Understanding the relationship between textual news and time-series evolution is a critical yet under-explored challenge in applied data science. While multimodal learning has gained traction, existing multimodal time-series datasets fall short in evaluating cross-modal reasoning and complex question answering, which are essential for capturing complex interactions between narrative information and temporal patterns. To bridge this gap, we introduce Multimodal Time Series Benchmark (MTBench), a large-scale benchmark designed to evaluate large language models (LLMs) on time series and text understanding across financial and weather domains. MTBench comprises paired time series and textual data, including financial news with corresponding stock price movements and weather reports aligned with historical temperature records. Unlike existing benchmarks that focus on isolated modalities, MTBench provides a comprehensive testbed for models to jointly reason over structured numerical trends and unstructured textual narratives. The richness of MTBench enables formulation of diverse tasks that require a deep understanding of both text and time-series data, including time-series forecasting, semantic and technical trend analysis, and news-driven question answering (QA). These tasks target the model's ability to capture temporal dependencies, extract key insights from textual context, and integrate cross-modal information. We evaluate state-of-the-art LLMs on MTBench, analyzing their effectiveness in modeling the complex relationships between news narratives and temporal patterns. Our findings reveal significant challenges in current models, including difficulties in capturing long-term dependencies, interpreting causality in financial and weather trends, and effectively fusing multimodal information.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- (4 more...)
- Banking & Finance > Trading (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
Exploring Transfer Learning for Deep Learning Polyp Detection in Colonoscopy Images Using YOLOv8
Vazquez, Fabian, Nuñez, Jose Angel, Fu, Xiaoyan, Gu, Pengfei, Fu, Bin
Deep learning methods have demonstrated strong performance in object detection tasks; however, their ability to learn domain-specific applications with limited training data remains a significant challenge. Transfer learning techniques address this issue by leveraging knowledge from pre-training on related datasets, enabling faster and more efficient learning for new tasks. Finding the right dataset for pre-training can play a critical role in determining the success of transfer learning and overall model performance. In this paper, we investigate the impact of pre-training a YOLOv8n model on seven distinct datasets, evaluating their effectiveness when transferred to the task of polyp detection. We compare whether large, general-purpose datasets with diverse objects outperform niche datasets with characteristics similar to polyps. In addition, we assess the influence of dataset size on the efficacy of transfer learning. Experiments on the polyp datasets show that models pre-trained on relevant datasets consistently outperform those trained from scratch, highlighting the benefit of pre-training on datasets with shared domain-specific features.
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- Asia > China > Fujian Province > Fuzhou (0.04)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.66)
Deep Learning for Early Alzheimer Disease Detection with MRI Scans
Rafsan, Mohammad, Oraby, Tamer, Roy, Upal, Kumar, Sanjeev, Rodrigo, Hansapani
Alzheimer's disease (AD) is a neurodegenerative condition characterized by dementia and impairment in neurological function. The study primarily focuses on individuals above age 40, in whom the disease affects memory, behavior, and cognitive processes. Diagnosing AD requires a detailed assessment of patients' MRI scans and neuropsychological tests. This project compares existing deep learning models in the pursuit of enhancing the accuracy and efficiency of AD diagnosis, specifically focusing on the Convolutional Neural Network, the Bayesian Convolutional Neural Network, and the U-net model, using the Open Access Series of Imaging Studies brain MRI dataset. In addition, to ensure robustness and reliability in the model evaluations, we address the challenge of imbalance in the data. We then perform a rigorous evaluation to determine the strengths and weaknesses of each model by considering sensitivity, specificity, and computational efficiency. This comparative analysis sheds light on the future role of AI in revolutionizing AD diagnostics and paves the way for future innovation in medical imaging and the management of neurodegenerative diseases.
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Texas > Hidalgo County > Edinburg (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- (3 more...)
- Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
SouLLMate: An Application Enhancing Diverse Mental Health Support with Adaptive LLMs, Prompt Engineering, and RAG Techniques
Guo, Qiming, Tang, Jinwen, Sun, Wenbo, Tang, Haoteng, Shang, Yi, Wang, Wenlu
Mental health issues significantly impact individuals' daily lives, yet many do not receive the help they need even with available online resources. This study aims to provide diverse, accessible, stigma-free, personalized, and real-time mental health support through cutting-edge AI technologies. It makes the following contributions: (1) Conducting an extensive survey of recent mental health support methods to identify prevalent functionalities and unmet needs. (2) Introducing SouLLMate, an adaptive LLM-driven system that integrates LLM technologies, Chain, Retrieval-Augmented Generation (RAG), prompt engineering, and domain knowledge. This system offers advanced features such as Risk Detection and Proactive Guidance Dialogue, and utilizes RAG for personalized profile uploads and Conversational Information Extraction. (3) Developing novel evaluation approaches for preliminary assessments and risk detection via professionally annotated interview data and real-life suicide tendency data. (4) Proposing the Key Indicator Summarization (KIS), Proactive Questioning Strategy (PQS), and Stacked Multi-Model Reasoning (SMMR) methods to enhance model performance and usability through context-sensitive response adjustments, semantic coherence evaluations, and enhanced accuracy of long-context reasoning in language models. This study contributes to advancing mental health support technologies, potentially improving the accessibility and effectiveness of mental health care globally.
- North America > United States > Missouri > Boone County > Columbia (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > Canada (0.04)
- (6 more...)
- Research Report (1.00)
- Overview (1.00)